The Application of Restricted Counter Schemes to Three Models of Linear Search

Authors

  • Micha Hofri
  • Hadas Shachnai
Abstract

The mechanism of the Counter Scheme (CS) has been shown to be an effective statistical approach for the reorganization of linear lists, where the records in the list are referenced independently with a time-homogeneous multinomial distribution. In this paper we show that derivative schemes can be used effectively in other contexts as well. Specifically, we consider (a) linear lists that are doubly linked, so that they may be accessed at both ends; (b) multilists, which result from dissecting a linear list into several pieces that are accessed independently and reside in WORM (write-once-read-many) storage; and (c) reorganizing a disk, by copying its contents to another disk, so as to minimize the expected seek time required to access a record.

† Department of Computer Science, University of Houston, Houston TX 77204-3475, USA.
‡ Department of Computer Science, Technion - Israel Institute of Technology, Haifa 32000, Israel.

1. List Reorganization with the Restricted Counters Scheme: Concepts and Notation

Finding an expedient order for the elements of a linear list is a well-studied problem. Most of the work in the area considered the following elementary data structure: $L = \{R_1, \ldots, R_n\}$ is a linear (singly linked) list of $n$ records, initially linked in an arbitrary order. The set of records is fixed in time, with no additions or deletions. The requests to access the list obey the independent reference model (irm), and use a reference-probabilities vector (rpv) $p \equiv (p_1, \ldots, p_n)$, where $p_i$ is the fixed (time-homogeneous) probability that an access request is for the record $R_i$, $1 \le i \le n$. Typically, each access requires a sequential search starting at the head of the list, until the specified record is encountered. We define the cost of a reference to an element in position $j$ in $L$ as $j$. The objective of managing the list is to minimize the expected cost of access.

In the ideal case where the access probabilities are all known, this model calls for an optimal static arrangement of the records in decreasing order of their reference probabilities, i.e.

$$p_i > p_j \iff \sigma_o(i) < \sigma_o(j),$$

where $\sigma_o(i)$ denotes the location of $R_i$ in the optimal permutation $\sigma_o$ of the list $L$. Ties are resolved arbitrarily. However, it is more common that no initial information is available on the rpv, and the sequence of requests may be considered as a learning process. Then, during a history of references, the list is dynamically reorganized by a permutation rule, which may use any information accumulated during the process to achieve the desirable order.

Three reorganization methods give rise to a large set of rules presented in previous work. Move To the Front (MTF) shifts the accessed record to the head of the list, leaving the relative order of the other elements unchanged. Transpose (TR) advances the accessed element one step ahead by an interchange with its immediate predecessor. The Counter Scheme (CS) keeps a reference count for each record, which tracks the number of references to it; the list is maintained sorted in nonincreasing order of the counter values. A comprehensive account of the policies considered in the last two decades appears in [6]. Some variations and more recent analyses of the above rules appear in [3,5,8,10,12]. We note that all the reorganization methods allow us to model the list order as an ergodic Markov chain of $n!$ states.

Let $C_m(\mathrm{PR} \mid p)$ and $C(\mathrm{PR} \mid p)$ denote the expected access cost to the list using the permutation rule PR and the rpv $p$, after the $m$th request and in the limiting state, respectively. The initial order of the list plays a role in $C_m(\mathrm{PR} \mid p)$ only; the limiting value is independent of it. In most of the published works, and in all the analyses and results shown below, the initial state, where it matters, is assumed to be uniformly distributed over all $n!$ possibilities.
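The three rules of Section 1 (MTF, TR and CS) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the list is a plain Python list whose head is at index 0, and the CS is assumed to move the accessed record ahead only of records with a strictly smaller count, since the text does not say how ties are handled.

```python
def move_to_front(lst, key):
    """MTF: move the accessed record to the head; return the access cost."""
    i = lst.index(key)            # sequential search from the head
    lst.insert(0, lst.pop(i))     # relative order of the others is unchanged
    return i + 1                  # cost = position of the record (1-based)


def transpose(lst, key):
    """TR: swap the accessed record with its immediate predecessor."""
    i = lst.index(key)
    if i > 0:
        lst[i - 1], lst[i] = lst[i], lst[i - 1]
    return i + 1


def counter_scheme(lst, counts, key):
    """CS: count the reference and keep the list in nonincreasing count order.

    Assumption (not stated in the text): on a tie the accessed record stays
    behind records that already have the same count.
    """
    i = lst.index(key)
    cost = i + 1
    counts[key] = counts.get(key, 0) + 1
    # Bubble the record forward past every record with a smaller count.
    while i > 0 and counts.get(lst[i - 1], 0) < counts[key]:
        lst[i - 1], lst[i] = lst[i], lst[i - 1]
        i -= 1
    return cost
```

Each function returns the cost of the reference as defined above (the position of the record before the reorganization) and mutates the list in place, so a reference sequence can be replayed by calling one rule repeatedly on the same list.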
Courcoubetis and Weber present in [4] a chain inequality, which characterizes the relations between any pair of the four orderings mentioned above (the optimal static order and the three rules) in the limit of a long reorganization sequence:

$$C(\mathrm{OPT} \mid p) = C(\mathrm{CS} \mid p) \le C(\mathrm{TR} \mid p) \le C(\mathrm{MTF} \mid p) \le 2\,C(\mathrm{OPT} \mid p). \tag{1.1}$$

Both the MTF and TR are memory free, and they do not produce convergence to the optimal ordering for nonuniform rpv's. Moreover, for any reference sequence which keeps referencing at least two distinct keys, both rules never stop reordering the list. On the other hand, the CS, which asymptotically attains the minimal average cost, suffers from a space problem, since the counters are unbounded. Hence it is impractical for long reference sequences and large values of $n$.

There are, however, two ways of curbing the space requirements of the CS. Both produce suboptimal rules, where the departure from optimality can be controlled by parameters. One approach is the Terminating Counter Scheme (TCS), according to which the list is reorganized for a finite, predetermined number of times, $m$. The TCS guarantees that the expected access cost is within a factor of $(1 + \alpha)$ of the optimal access cost. It was introduced in [8], and we show there that one may take $m = n(n-1)(1+\alpha)/(16\alpha^2)$. (Special situations, in particular extra information about $p$, may admit substantially lower bounds.)

The second approach, called the Limited Counters Scheme (LCS), also analyzed in [8], is defined there as follows: each record $R_i$, $1 \le i \le n$, is associated with a frequency counter $C_i$, which may not exceed the value $c_{\max}$. As long as $C_i < c_{\max}$, it is incremented at each request to $R_i$. At the same time $R_i$ is shifted forward if necessary, so as to precede any $R_j$ with $C_j < C_i$ (we can only have such $C_j = C_i - 1$). When $C_i$ reaches $c_{\max}$ it remains fixed, and the location of $R_i$ is unaffected by any subsequent references to the list. Hence, the dynamic reorganization process involves at most $n\,c_{\max}$ changes in record positions. If $c_{\max} = 1$, the LCS produces the same expected access cost as the MTF, i.e.

$$C_m(\mathrm{LCS} \mid p, 1) = C_m(\mathrm{MTF} \mid p) \quad \forall\, m \ge 1. \tag{1.2}$$

The asymptotic access cost to the list under LCS, with $c = c_{\max}$, satisfies the relation

$$\frac{C(\mathrm{LCS} \mid p, c)}{C(\mathrm{OPT} \mid p)} \le 1 + \max_{a \ge 1} \frac{(a-1)\left[\sum_{r=0}^{c} \binom{2c}{r} a^{r} - \tfrac{1}{2}\binom{2c}{c} a^{c}\right]}{(1+a)^{2c}}, \tag{1.3}$$

which is bounded by 1.2175 already for $c_{\max} = 3$. (Proofs of equations (1.2) and (1.3) appear in [8].) Some exact computations of that relation, for a few known distribution functions, lend strong support to the conjecture that LCS performs well, with very modest space requirements.

In the following sections, we consider the use of TCS or LCS on three other models of sequential search.

2. The CS for Doubly Linked Lists

An immediate elaboration of the above model is the doubly linked list. We assume the following layout: a set of $n$ records $\{R_i,\ 1 \le i \le n\}$, identified uniquely by their keys, is held in a doubly linked list $D$. Each element $R_i$ is accessed with fixed probability $p_i$, and the search for it may begin either at the "left" end of $D$, with probability $p_{iL}$, or at the "right" end, with probability $p_{iR} = 1 - p_{iL}$. A reference in this scheme specifies both a key and a starting point for the search. The access cost, as above, is defined to be the number of key comparisons required to locate a record in the list. The referenced key and the starting point for each search are both chosen independently of the past accesses and of all previous states of the list. This may typically result when the searching mechanism also takes part in other activities, and does not serve this reference string only. The initial ordering of $D$ is assumed random, with equal probability for each permutation. Let $p = (p_1, p_{1L}, \ldots, p_n, p_{nL})$ be the rpv.
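The Limited Counters Scheme of Section 1 and its bound (1.3) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the code of [8]: the class and function names are hypothetical, and the bound is read as $1 + \max_{a \ge 1} (a-1)\big[\sum_{r=0}^{c}\binom{2c}{r}a^r - \tfrac12\binom{2c}{c}a^c\big]/(1+a)^{2c}$, a reading that reproduces the value 1.2175 quoted in the text for $c_{\max} = 3$. The maximum is approximated on a finite grid of values of $a$, which is also an assumption of the sketch.

```python
from math import comb


class LimitedCounterList:
    """LCS sketch: counters saturate at cmax, after which a record is frozen."""

    def __init__(self, records, cmax):
        self.order = list(records)            # head of the list is index 0
        self.cmax = cmax
        self.count = {r: 0 for r in self.order}

    def access(self, key):
        i = self.order.index(key)
        cost = i + 1                          # position before reorganization
        if self.count[key] < self.cmax:
            self.count[key] += 1
            # Shift forward so as to precede every record with a smaller count.
            while i > 0 and self.count[self.order[i - 1]] < self.count[key]:
                self.order[i - 1], self.order[i] = self.order[i], self.order[i - 1]
                i -= 1
        return cost


def lcs_bound_term(a, c):
    """Expression maximized over a >= 1 in the assumed reading of (1.3)."""
    s = sum(comb(2 * c, r) * a**r for r in range(c + 1))
    s -= 0.5 * comb(2 * c, c) * a**c
    return (a - 1) * s / (1 + a) ** (2 * c)


def lcs_bound(c, a_max=50.0, step=0.001):
    """Grid approximation of 1 + max over a in [1, a_max] of lcs_bound_term.

    For c = 1 the term increases toward 1 as a grows, so the grid value only
    approaches the supremum; for c >= 2 the term tends to 0 for large a.
    """
    n_steps = int(round((a_max - 1.0) / step))
    return 1.0 + max(lcs_bound_term(1.0 + k * step, c) for k in range(n_steps + 1))
```

With `cmax=1` the final order of `LimitedCounterList` is simply the order of first references, and `lcs_bound(3)` evaluates to roughly 1.2175, in line with the figure quoted for (1.3).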
When $p$ is known, the average access cost is minimized when $D$ is kept in a static optimal ordering, which is in decreasing order (from left to right) of the values

$$p_i\left(p_{iL} - \tfrac{1}{2}\right) = \frac{p_i\,(p_{iL} - p_{iR})}{2}, \tag{2.1}$$

as shown by Matthews et al. in [11]. From now on we assume that $p$ is unknown, and $D$ is dynamically rearranged during the reference sequence. Some special degenerate instances of $p$ are considered in [11], for which the strategies adopted for singly linked lists are as effective in the present context.

Lemma 2.1 ([11]): (i) If $p_{iL} = p = 1/2$, $1 \le i \le n$, then no rearrangement is necessary or can improve the expected access cost. (ii) If $p_{iL} = p > 1/2$, $1 \le i \le n$, and $H$ is an optimal policy for the corresponding sequential list of $n$ elements when $p = 1$ (which behaves precisely as the singly linked list), then $H$ is optimal for $D$ as well. (In [7] we proved that $H = \mathrm{CS}$.)

We examine the general case, in which there exists at least one pair of indices $1 \le i, j \le n$ such that $p_{iL} \ne p_{jL}$. For that case, two memory-free rules were presented. Move To the End (MTE): an accessed record is moved to the position where the search for it began. Transpose Toward End (TTE): an accessed record is shifted one step towards the starting point of its search. A result analogous to the one derived by Rivest [13] for linear lists is stated in

Lemma 2.2 ([11]): If $C(\mathrm{OPT} \mid p)$ is the minimum expected search cost when $D$ is optimally arranged, from left to right, then

$$C(\mathrm{MTE} \mid p) < 2\,C(\mathrm{OPT} \mid p). \tag{2.2}$$

We now define an equivalent to CS for doubly linked lists (DCS), which keeps a difference count $D_i$ for each record $R_i$, $1 \le i \le n$: $D_i = C_{iL} - C_{iR}$, where $C_{iL}$ and $C_{iR}$ denote the number of searches for $R_i$ starting at the left end and at the right end, respectively. $D_i$ is changed whenever $R_i$ is accessed: when the search starts at the left end, $D_i$ is incremented; otherwise it is decremented.
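The doubly linked model can be made concrete with a short sketch. It assumes that a record in position $j$ (counted from the left, 1-based) costs $j$ comparisons when searched from the left end and $n + 1 - j$ from the right end, and that the static optimal ordering sorts records by $p_i(p_{iL} - 1/2)$ as in (2.1). The names `expected_cost`, `static_optimal_order` and `DifferenceCounterList` are hypothetical, and keeping the list in nonincreasing order of the difference counts with a stable sort (so that ties keep their current relative order) is an assumption of this sketch.

```python
from itertools import permutations


def expected_cost(order, p, p_left):
    """Expected comparisons for a static left-to-right order of record indices."""
    n = len(order)
    total = 0.0
    for pos, i in enumerate(order, start=1):
        from_left = p_left[i] * pos                    # search starts at the left
        from_right = (1.0 - p_left[i]) * (n + 1 - pos)  # search starts at the right
        total += p[i] * (from_left + from_right)
    return total


def static_optimal_order(p, p_left):
    """Order by decreasing p_i * (p_iL - 1/2), the criterion of (2.1)."""
    return sorted(range(len(p)), key=lambda i: p[i] * (p_left[i] - 0.5), reverse=True)


def brute_force_minimum(p, p_left):
    """Smallest expected cost over all n! orders (only feasible for tiny n)."""
    return min(expected_cost(o, p, p_left) for o in permutations(range(len(p))))


class DifferenceCounterList:
    """DCS sketch: D_i = (left searches) - (right searches) for each record."""

    def __init__(self, records):
        self.order = list(records)             # index 0 is the left end
        self.diff = {r: 0 for r in self.order}

    def access(self, key, from_left):
        i = self.order.index(key)
        n = len(self.order)
        cost = i + 1 if from_left else n - i   # comparisons from the chosen end
        self.diff[key] += 1 if from_left else -1
        # Stable sort: nonincreasing D from left to right, ties keep their order.
        self.order.sort(key=lambda r: -self.diff[r])
        return cost
```

For a small instance such as `p = [0.4, 0.3, 0.2, 0.1]` and `p_left = [0.9, 0.2, 0.5, 0.7]`, the cost of `static_optimal_order(p, p_left)` can be compared with `brute_force_minimum(p, p_left)` as a sanity check of the ordering criterion.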
The DCS maintains the list in nonincreasing order of $D_i$ from left to right.

Lemma 2.3: The average access cost to $D$ under DCS after the $m$th request, $m \ge 1$, satisfies

(2.3) C_m(DCS | p) = 1 + (n − 1) (1 − n

Publication date: 1991